ASYMPTOTIC BEHAVIOR OF THE POISSON-DIRICHLET
DISTRIBUTION FOR LARGE MUTATION RATE

By Donald A. Dawson and Shui Feng 

Carleton University and McMaster University 

The large deviation principle is established for the Poisson-Dirichlet
distribution when the parameter $\theta$ approaches infinity. The result is
then used to study the asymptotic behavior of the homozygosity and
the Poisson-Dirichlet distribution with selection. A phase transition
occurs depending on the growth rate of the selection intensity. If the
selection intensity grows sublinearly in $\theta$, then the large deviation
rate function is the same as the neutral model; if the selection intensity
grows at a linear or greater rate in $\theta$, then the large deviation
rate function includes an additional term coming from selection. The
application of these results to the heterozygote advantage model provides
an alternate proof of one of Gillespie's conjectures in [Theoret.
Popul. Biol. 55 145-156].

1. Introduction. Let

$$V = \Big\{(p_1, p_2, \ldots) : p_1 \ge p_2 \ge \cdots \ge 0,\ \sum_{k=1}^{\infty} p_k = 1\Big\}.$$

The Poisson-Dirichlet distribution with parameter $\theta > 0$ [henceforth
denoted by $PD(\theta)$] is a probability measure on $V$. It was introduced by
Kingman [10] as an asymptotic distribution of the order statistics of a
symmetric Dirichlet distribution with parameters $K$, $\alpha$ when
$K \to \infty$ and $\alpha \to 0$ in a way such that
$\lim_{K\to\infty} K\alpha = \theta$. The distribution coincides with the
distribution of the normalized jump sizes of a Gamma process over the interval
$(0, \theta)$ ranked in descending order. We use
$P(\theta) = (P_1(\theta), P_2(\theta), \ldots)$ to denote the $V$-valued
random variable with distribution $PD(\theta)$. $PD(\theta)$ appears in
many different contexts, including Bayesian statistics, number theory,
combinatorics and population genetics. In the context of population genetics,
the distribution describes the equilibrium proportions of different alleles in
the infinitely many neutral alleles model. The component $P_k(\theta)$
represents the proportion of the $k$th most frequent allele. If $u$ is the
individual mutation rate and $N$ is the effective population size, then the
parameter $\theta = 4Nu$ is the population mutation rate. When $\theta$ is
small, a large proportion of the population tends to concentrate on a small
set of alleles, whereas for large $\theta$, the population is fairly evenly
spread. A more accessible way of describing the distribution $PD(\theta)$ is
through the following size-biased sampling process. We first let $U_k$,
$k = 1, 2, \ldots$, be a sequence of independent, identically distributed
random variables with common distribution $\mathrm{Beta}(1, \theta)$. We then
generate a random probability vector representing allelic frequencies as
follows:

$$X_1 = U_1, \qquad X_n = (1 - U_1) \cdots (1 - U_{n-1}) U_n, \quad n \ge 2.$$

In other words, the frequency of the first allele type is chosen at random;
this type is removed, and the relative frequency of the second allele type
among the remainder is chosen in the same way. This pattern is repeated to get
all samples. Then the frequency of the allelic type of the $k$th selected
sample will be $X_k$. It can be shown that $X_1, X_2, \ldots$ reordered in
descending order has distribution $PD(\theta)$. The sequence $X_k$,
$k = 1, 2, \ldots$, corresponds to the size-biased permutation of
$PD(\theta)$ and the representation through $U_k$, $k = 1, 2, \ldots$, is
called the GEM representation after R. C. Griffiths, S. Engen and
J. W. McCloskey.
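
The size-biased construction above translates directly into a sampler. The
following Python sketch is illustrative only: the function names `sample_gem`
and `ranked` are hypothetical, and the sequence is truncated once the
unassigned mass drops below a tolerance `tol`, which is an assumption of the
sketch and not part of the construction. It draws $X_1, X_2, \ldots$ and sorts
them to approximate a draw from $PD(\theta)$.

```python
import random


def sample_gem(theta, tol=1e-12, rng=random):
    """Draw X_1, X_2, ... from the GEM(theta) stick-breaking scheme.

    Stops once the remaining (unassigned) mass falls below `tol`,
    so the returned weights sum to slightly less than 1.
    """
    weights = []
    remaining = 1.0  # running product (1 - U_1) ... (1 - U_{n-1})
    while remaining > tol:
        u = rng.betavariate(1.0, theta)  # U_n ~ Beta(1, theta)
        weights.append(remaining * u)    # X_n
        remaining *= 1.0 - u
    return weights


def ranked(weights):
    """Reorder in descending order: an approximate PD(theta) sample."""
    return sorted(weights, reverse=True)


if __name__ == "__main__":
    random.seed(0)
    for theta in (0.5, 5.0, 50.0):
        p = ranked(sample_gem(theta))
        # Number of retained components and the three largest proportions.
        print(theta, len(p), [round(x, 4) for x in p[:3]])
```

Running the sketch for increasing values of `theta` gives a quick visual
impression of how the largest proportions shrink as the mutation rate grows.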

Consider a population under the influence of mutation and selection. The
role of mutation is to bring in new types of alleles and reduce the proportion
of existing alleles, while the selection force favors certain genotypes
and, thus, alters allele proportions. It is interesting to understand how the
mutation and selection forces interact. The limiting procedure with $\theta$
approaching infinity is equivalent to letting the population size go to
infinity. By the study of the behavior of $PD(\theta)$ for large $\theta$, one
would hope to get a better picture of the interaction between mutation and
selection. For the overdominance model, where the heterozygote has an
advantage over the homozygote, it is observed in [5] that, when both the
mutation rate and the selection rate are scaled by large $\theta$, the model
behaves the same as a neutral model. This was confirmed later by Joyce, Krone
and Kurtz [9] through the study of the stationary distribution of the
infinitely many alleles diffusion with heterozygote advantage. A critical
growth rate $\theta^{3/2}$ is identified such that selection will not be
detected if its rate grows more slowly than the critical rate.

Let $\xi_k$, $k = 1, 2, \ldots$, be a sequence of i.i.d. random variables with
common diffusive distribution $\nu$ on $[0, 1]$, that is, $\nu(\{x\}) = 0$ for
every $x$ in $[0, 1]$. Set

$$\Pi = \sum_{k=1}^{\infty} P_k(\theta)\, \delta_{\xi_k}.$$

It is known that the law of $\Pi$ is the Dirichlet($\nu$) distribution, and is
the stationary distribution of the Fleming-Viot process with mutation operator

$$Af(x) = \frac{\theta}{2} \int_0^1 (f(y) - f(x))\, \nu(dy).$$

Dawson and Feng [1, 2] studied the asymptotic behavior of $\Pi$ for large
$\theta$ and established the large deviation principle (henceforth LDP) for
the law of $\Pi$. It is worth noting that there are fundamental differences
between $\Pi$ and $P(\theta)$ even though their laws are both called
Poisson-Dirichlet distribution in the literature. A detailed discussion is
given in Section 4.

The main result of this article is the LDP for $PD(\theta)$ for large
$\theta$. When $\theta$ approaches infinity, $P_k(\theta)$ converges to zero
for every $k$. Since $\sum_{k=1}^{\infty} P_k(\theta) = 1$, the allele
proportions are evenly spread out for large $\theta$. We will see from the
LDP that, at the exponential scale, the differences between different allele
proportions are still significant.

Our first result is the LDP for the law of $P_1(\theta)$. This is then used to
derive the LDP for finite marginal distributions of $P(\theta)$, namely, the
law of $(P_1(\theta), \ldots, P_n(\theta))$ for every $n$. These eventually
lead to the establishment of the LDP for the law of $P(\theta)$. All rate
functions have explicit forms.

In Section 2 we review several general results on LDP, and formulate a
comparison lemma. Some estimates on the Beta distribution are proved in
Section 3. Our main LDP results for $PD(\theta)$ are formulated and proved in
Section 4. In Section 5 the LDP result for $PD(\theta)$ is used to derive the
LDP of the homozygosity and of $PD(\theta)$ with selection. Our result shows
that a phase transition occurs with parameter given by the selection
intensity. Let the selection be scaled by $\theta^{\gamma}$. Then for the
selection to be detected at the large deviation scale, $\gamma$ has to be
greater than or equal to 1. For the heterozygote advantage model, this
provides a new proof of a conjecture in [5]. In the LDP setting the critical
scale turns out to be $\theta$ instead of $\theta^{3/2}$, which was obtained
in the case of Joyce, Krone and Kurtz [9].

The study of the behavior of $P(\theta) = (P_1(\theta), P_2(\theta), \ldots)$
for large $\theta$ has a long history. In Watterson and Guess [17]
$E[P_1(\theta)]$ was shown to be asymptotically $\log\theta / \theta$.
Griffiths [6] obtained the explicit weak limit of $\theta P(\theta)$ and
a central limit theorem for the homozygosity. A more detailed description of
these results and their relation to our results will be included in Section 4.

One may be able to generalize our result to the two-parameter
Poisson-Dirichlet distribution studied in [12]. The residual allocation model
now involves two parameters $\theta + \alpha > 0$, $0 \le \alpha < 1$, such
that $U_k$ is a $\mathrm{Beta}(1 - \alpha, \theta + k\alpha)$ random variable
for each $k$. Since the mutation force becomes stronger with the introduction
of $\alpha$, one expects the speed of convergence will be higher than that of
$PD(\theta)$. For a more comprehensive discussion on $PD(\theta)$ and its
two-parameter counterpart, we recommend [13, 14] and the references therein.

2. Preliminaries. We include several known results on LDP in this section.
A comparison lemma will be formulated as a direct application of the
Gärtner-Ellis theorem. All results will be stated in the form that is
sufficient for our purposes. For the most general form, we refer to [3]. Let
$E$ be a complete separable metric space with metric $\rho$.

Definition 2.1. A family of probability measures
$\{Q_\varepsilon : \varepsilon > 0\}$ on $E$ is said to satisfy an LDP with
speed $1/\varepsilon$ and rate function $I(\cdot)$ if, for any closed set $F$
and open set $G$ in $E$,

$$\limsup_{\varepsilon \to 0} \varepsilon \log Q_\varepsilon\{F\} \le -\inf_{x \in F} I(x),$$

$$\liminf_{\varepsilon \to 0} \varepsilon \log Q_\varepsilon\{G\} \ge -\inf_{x \in G} I(x),$$

and, for any $c > 0$, $\{x : I(x) \le c\}$ is compact.

Definition 2.2. A family of probability measures
$\{Q_\varepsilon : \varepsilon > 0\}$ is said to satisfy a partial LDP if, for
every sequence $\varepsilon_n$ converging to zero, there is a subsequence
$\varepsilon_n'$ such that the family
$\{Q_{\varepsilon_n'} : \varepsilon_n' > 0\}$ satisfies an LDP with speed
$1/\varepsilon_n'$ and rate function $I'$.

Remark. A partial LDP will become an LDP if all the rate functions
$I'$ are the same. The following result is found in [15].

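Theorem 2.1 (Pukhalskii). (i) Assume that
$\{Q_\varepsilon : \varepsilon > 0\}$ satisfies a partial LDP with speed
$1/\varepsilon$ and for every $x$ in $E$,

(1)  $$\lim_{\delta \to 0} \limsup_{\varepsilon \to 0} \varepsilon \log Q_\varepsilon\{y : \rho(y, x) \le \delta\} = \lim_{\delta \to 0} \liminf_{\varepsilon \to 0} \varepsilon \log Q_\varepsilon\{y : \rho(y, x) < \delta\} = -I(x).$$

Then $\{Q_\varepsilon : \varepsilon > 0\}$ satisfies an LDP with speed
$1/\varepsilon$ and rate function $I(\cdot)$.

(ii) If $E$ is compact, then the partial large deviation principle is
automatically satisfied.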

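Theorem 2.2 (Varadhan). Assume that $\{Q_\varepsilon : \varepsilon > 0\}$
satisfies an LDP with speed $1/\varepsilon$ and a rate function $I(\cdot)$.
Let $C_b(E)$ denote the set of bounded continuous functions on $E$. Then for
any $\phi(x)$ in $C_b(E)$, one has

(2)  $$\lim_{\varepsilon \to 0} \varepsilon \log E^{Q_\varepsilon}\big[e^{\phi(x)/\varepsilon}\big] = \sup_{x \in E} \{\phi(x) - I(x)\}.$$

Theorem 2.3 (Gärtner-Ellis). Let $E = \mathbb{R} = (-\infty, \infty)$. Assume
that

$$\Lambda(\lambda) = \lim_{\varepsilon \to 0} \varepsilon \log E^{Q_\varepsilon}\big[e^{\lambda x/\varepsilon}\big] < \infty \quad \text{for all } \lambda,$$

and has the first-order derivative $\Lambda'(\lambda)$. Then
$\{Q_\varepsilon : \varepsilon > 0\}$ satisfies an LDP with speed
$1/\varepsilon$ and rate function

$$I(x) = \sup_{\lambda} \{\lambda x - \Lambda(\lambda)\}.$$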

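As a direct application of Theorem 2.3, we get the following:

Lemma 2.4 (Comparison lemma). Let $E = (-\infty, \infty)$. Assume that
$\{X_\varepsilon : \varepsilon > 0\}$, $\{Y_\varepsilon : \varepsilon > 0\}$,
$\{Z_\varepsilon : \varepsilon > 0\}$ are three families of random variables
on the same probability space with respective laws
$\{Q^1_\varepsilon : \varepsilon > 0\}$,
$\{Q^2_\varepsilon : \varepsilon > 0\}$,
$\{Q^3_\varepsilon : \varepsilon > 0\}$. If both
$\{Q^1_\varepsilon : \varepsilon > 0\}$ and
$\{Q^3_\varepsilon : \varepsilon > 0\}$ satisfy the assumptions in
Theorem 2.3 with the same $\Lambda$, and with probability one

$$X_\varepsilon \le Y_\varepsilon \le Z_\varepsilon,$$

then $\{Q^2_\varepsilon : \varepsilon > 0\}$ satisfies an LDP with speed
$1/\varepsilon$ and rate function

$$I(x) = \sup_{\lambda} \{\lambda x - \Lambda(\lambda)\}.$$

Remark. Both Theorem 2.3 and Lemma 2.4 hold if $E$ is only a closed
subset of $\mathbb{R}$.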

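3. Some estimates for the beta distribution. Let $U_1, U_2, \ldots$ be a
sequence of i.i.d. random variables with common distribution
$\mathrm{Beta}(1, \theta)$. Let $E = [0, 1]$. Then we have the following:

Lemma 3.1. For any $n \ge 1$, let $Q_{n,\theta}$ be the law of
$Z_n = \max\{U_1, \ldots, U_n\}$. Then the family
$\{Q_{n,\theta} : \theta > 0\}$ satisfies an LDP on $E$ with speed $\theta$
and rate function

(3)  $$I(x) = \begin{cases} \log \dfrac{1}{1 - x}, & x \in [0, 1), \\ \infty, & \text{else.} \end{cases}$$

Proof. Let

(4)  $$\Lambda(\lambda) = \operatorname*{ess\,sup}_{y \in [0,1]} \{\lambda y + \log(1 - y)\} = \begin{cases} \lambda - 1 - \log \lambda, & \lambda > 1, \\ 0, & \text{else.} \end{cases}$$

Then clearly $\Lambda(\lambda)$ is finite for all $\lambda$ and is
differentiable. By direct calculation, we have

$$E\big[e^{\theta \lambda Z_n}\big] = \int_0^1 \exp\{\theta H_\theta(y)\}\, dy,$$

where

(5)  $$H_\theta(y) = \lambda y + \frac{\log n + \log \theta}{\theta} + \frac{n - 1}{\theta} \log\big[1 - (1 - y)^{\theta}\big] + \frac{\theta - 1}{\theta} \log(1 - y).$$

Letting $\theta$ go to infinity, we get

$$\lim_{\theta \to \infty} \log \big\{E\big[e^{\theta \lambda Z_n}\big]\big\}^{1/\theta} = \Lambda(\lambda),$$

which, combined with Theorem 2.3 (with $\varepsilon = 1/\theta$), implies the
lemma. □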
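
The last step of this proof rests on the Legendre transform of
$\Lambda(\lambda) = \lambda - 1 - \log\lambda$ (for $\lambda > 1$, zero
otherwise) being the rate function $\log\frac{1}{1-x}$ on $[0, 1)$. A minimal
numerical sketch of that identity follows; the grid search, its bounds
`lam_max` and `steps`, and the function names are assumptions made for
illustration, and the check is only meaningful for $x$ not too close to 1.

```python
import math


def Lambda(lam):
    """Limiting log-moment generating function: lam - 1 - log(lam) for lam > 1, else 0."""
    return lam - 1.0 - math.log(lam) if lam > 1.0 else 0.0


def rate(x):
    """Candidate rate function log(1 / (1 - x)) on [0, 1)."""
    return math.log(1.0 / (1.0 - x))


def legendre(x, lam_max=200.0, steps=200_000):
    """Grid approximation of sup over lam in [0, lam_max] of lam * x - Lambda(lam).

    For x >= 0 negative lam cannot help, since Lambda vanishes there.
    """
    best = 0.0  # value at lam = 0
    for i in range(1, steps + 1):
        lam = lam_max * i / steps
        best = max(best, lam * x - Lambda(lam))
    return best


if __name__ == "__main__":
    for x in (0.1, 0.5, 0.9):
        print(x, legendre(x), rate(x))
```

The supremum is attained at $\lambda = 1/(1 - x)$, which is why the grid must
extend well beyond that value for the comparison to be accurate.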

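Lemma 3.2. For any $k \ge 1$, let $n_k(\theta)$ denote the integer part of
$\theta^k$. Then the family $\{Q_{n_k(\theta),\theta} : \theta > 0\}$
satisfies an LDP with speed $\theta$ and rate function $I(\cdot)$ defined
in (3).

Proof. Choosing $n = n_k(\theta)$ in Lemma 3.1, we get

$$H_\theta(y) = \lambda y + \frac{\log n_k(\theta) + \log \theta}{\theta} + \frac{n_k(\theta) - 1}{\theta} \log\big[1 - (1 - y)^{\theta}\big] + \frac{\theta - 1}{\theta} \log(1 - y).$$

For any $\varepsilon$ in $(0, 1/2)$, and $\lambda \ge 0$, we have

$$\Lambda(\lambda) = \lim_{\theta \to \infty} \frac{1}{\theta} \log E\big[e^{\theta \lambda U_1}\big] \le \limsup_{\theta \to \infty} \frac{1}{\theta} \log E\big[e^{\theta \lambda Z_{n_k(\theta)}}\big] \le \max\Big\{\lambda \varepsilon,\ \operatorname*{ess\,sup}_{y \ge \varepsilon} [\lambda y + \log(1 - y)]\Big\},$$

where the last inequality follows from the fact that, for $y$ in
$[\varepsilon, 1]$,
$\lim_{\theta \to \infty} \theta^{l} \log[1 - (1 - y)^{\theta}] = 0$ for any
$l \ge 1$. Letting $\varepsilon$ go to zero, it follows that

$$\Lambda(\lambda) = \lim_{\theta \to \infty} \frac{1}{\theta} \log E\big[e^{\theta \lambda Z_{n_k(\theta)}}\big].$$

For negative $\lambda$, we have

$$\limsup_{\theta \to \infty} \frac{1}{\theta} \log E\big[e^{\theta \lambda Z_{n_k(\theta)}}\big] \ge \lim_{\varepsilon \to 0} \operatorname*{ess\,sup}_{y \ge \varepsilon} \{\lambda y + \log(1 - y)\} = 0 = \Lambda(\lambda).$$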
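The lemma follows from Theorem 2.3. □

Lemma 3.3. For any $n \ge 1$, let
$W_n = (1 - U_1)(1 - U_2) \cdots (1 - U_n)$. Then for any $\delta > 0$,

(6)  $$\limsup_{\theta \to \infty} \frac{1}{\theta} \log P\{W_{n_2(\theta)} \ge \delta\} = -\infty.$$

Proof. By direct calculation,

$$P\{W_{n_2(\theta)} \ge \delta\} = P\Big\{\theta \sum_{j=1}^{n_2(\theta)} \log(1 - U_j) \ge \theta \log \delta\Big\} \le e^{\theta \log(1/\delta)} \big(E\big[e^{\theta \log(1 - U_1)}\big]\big)^{n_2(\theta)} = e^{\theta \log(1/\delta)} \Big(\frac{1}{2}\Big)^{n_2(\theta)} \le \exp\Big\{\theta \log\frac{1}{\delta} - (\theta^2 - 1) \log 2\Big\}.$$

(6) follows by letting $\theta$ go to infinity. □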

4. LDP for the Poisson-Dirichlet distribution. Let
$X(\theta) = (X_1, X_2, \ldots)$ be the GEM, that is,

(7)  $$X_1 = U_1, \qquad X_k = (1 - U_1) \cdots (1 - U_{k-1}) U_k, \quad k \ge 2,$$

and

(8)  $$P(\theta) = (P_1(\theta), P_2(\theta), \ldots),$$

with $P_k(\theta)$ the $k$th largest component of $X(\theta)$. The law of
$P(\theta)$ is thus $PD(\theta)$.

In this section we will establish the LDP for $PD(\theta)$ when $\theta$
becomes large. To help motivate this result, some earlier works on the
asymptotic behavior of $PD(\theta)$ are included and their relations to our
result are discussed.

4.1. Scaling limits. Recall that the parameter $\theta$ is the population
mutation rate. In the infinite neutral allele models all mutations produce new
alleles. It is thus reasonable to expect that the higher the mutation rate,
the smaller the proportion of the most frequent allele will be.

In [17], the exact expression and asymptotic expression were obtained for
$E[P_1(\theta)]$. In particular, they showed that

(9)  $$E[P_1(\theta)] \sim \frac{\log \theta}{\theta},$$

which implies that $\lim_{\theta \to \infty} P_k(\theta) = 0$ for each $k$.
Since $\sum_{k=1}^{\infty} P_k(\theta) = 1$, it follows that the differences
between the proportions $\{P_k(\theta) : k \ge 1\}$ become smaller when
$\theta$ becomes large.

Griffiths [6] generalized the result in [17], and obtained expressions for
the expectations and variances of $P_k(\theta)$ for each $k$. The moments of
$P_k(\theta)$ for any $k \ge 1$ were given by the following:

(10)  $$E[P_k^m(\theta)] = \frac{\Gamma(\theta + 1)}{\Gamma(\theta + m)} \int_0^{\infty} \frac{(\theta J(u))^{k-1}}{(k-1)!}\, u^{m-1} e^{-u - \theta J(u)}\, du,$$

where $J(u) = \int_u^{\infty} \frac{e^{-x}}{x}\, dx$. In particular, one has,
for any $k \ge 1$,

$$E[P_k(\theta)] = \int_0^{\infty} \frac{(\theta J(u))^{k-1}}{(k-1)!}\, e^{-u - \theta J(u)}\, du \to 0 \quad \text{as } \theta \to \infty.$$

Perman [11] obtained generalizations of (10) to normalized jump sizes of
subordinators. A scaling limit, to be described below, was also obtained in
[6].

For each $r \ge 1$, let $\infty > Y_1 > Y_2 > \cdots > Y_r > -\infty$ have a
joint distribution with density

(11)  $$\exp\{-(y_1 + \cdots + y_r) - e^{-y_r}\}.$$

It is clear that the marginal density of $Y_k$ is

(12)  $$\frac{1}{(k-1)!} \exp\{-(k y + e^{-y})\}.$$

For $k = 1, \ldots, r$, set

(13)  $$\beta(\theta) = \log \theta - \log\log \theta, \qquad Y_k(\theta) = \theta P_k(\theta) - \beta(\theta).$$

Theorem 4.1 (Griffiths). For each $r \ge 1$,
$(Y_1(\theta), \ldots, Y_r(\theta))$ converges weakly to
$(Y_1, \ldots, Y_r)$ when $\theta$ goes to infinity.

The result (9) can be viewed as a kind of law of large numbers and
Theorem 4.1 as a "central limit" type theorem. This brings us naturally to the
study of large deviations in the next subsection.

4.2. Large deviations. There are two different versions of the
infinitely-many-neutral-alleles model: one is a special Fleming-Viot process
with parent independent mutation operator with mutation rate $\theta$ and
mutation probability $\nu$, and the other is an infinite-dimensional diffusion
process with state space $V$ and generator

$$\frac{1}{2} \sum_{i,j=1}^{\infty} p_i(\delta_{ij} - p_j) \frac{\partial^2}{\partial p_i\, \partial p_j} - \frac{\theta}{2} \sum_{i=1}^{\infty} p_i \frac{\partial}{\partial p_i}$$

defined on an appropriate domain. The Fleming-Viot version is called the
labeled version and the second version is called unlabeled. Fundamental
differences exist between the two versions. For example, the labeled version
does not have a transition density, while the unlabeled version does; the
unlabeled version has one less eigenvalue than the labeled version (see [4]).
But both models are reversible with respective reversible measures
Dirichlet($\nu$) and $PD(\theta)$. Let $M_1([0, 1])$ be the set of all
probability measures on $[0, 1]$. If we introduce the map $\Phi$ between
$M_1([0, 1])$ and the closure of $V$ in $\mathbb{R}^{\infty}$ such that
$\Phi(\mu)$ is the descending sequence of masses of the atoms of $\mu$, then
the unlabeled model is just the image of the labeled Fleming-Viot model
under $\Phi$. Thus, many properties of the unlabeled version can be derived
from the labeled one. Since the LDP for Dirichlet($\nu$) has been established
in [1, 2], one would hope to get the LDP for $PD(\theta)$ from the LDP for
Dirichlet($\nu$) through $\Phi$. Unfortunately, $\Phi$ is not continuous, as
the following example shows. Let
$\mu_n = \frac{1}{n} \sum_{k=1}^{n} \delta_{k/n^2}$. Then $\mu_n$ converges
weakly to $\delta_0$, while
$\Phi(\mu_n) = (\frac{1}{n}, \ldots, \frac{1}{n}, 0, \ldots)$ converges to
$(0, \ldots, 0, \ldots)$ rather than $(1, 0, \ldots) = \Phi(\delta_0)$. To use
the contraction principle in large deviation theory, one has to prove some
exponential approximation to $\Phi$ by a sequence of continuous maps. We
choose to prove the LDP for $PD(\theta)$ directly.
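
The counterexample can be made concrete with exact rational arithmetic. In the
sketch below, the helper names `mu_n` and `phi` are hypothetical; a measure is
represented as a list of (location, mass) pairs, which is an assumption of the
sketch. As $n$ grows, all atoms sit in $[0, 1/n]$, so the measures concentrate
at 0, while the first coordinate of the ranked masses is $1/n$ instead of
approaching 1.

```python
from fractions import Fraction


def mu_n(n):
    """Atoms of mu_n as (location, mass) pairs: mass 1/n at k/n^2, k = 1..n."""
    return [(Fraction(k, n * n), Fraction(1, n)) for k in range(1, n + 1)]


def phi(atoms):
    """Descending sequence of atom masses (the map Phi restricted to atomic measures)."""
    return sorted((mass for _, mass in atoms), reverse=True)


for n in (1, 10, 100, 1000):
    atoms = mu_n(n)
    support_max = max(loc for loc, _ in atoms)  # every atom lies in [0, 1/n]
    top_mass = phi(atoms)[0]                    # first coordinate of Phi(mu_n)
    print(n, float(support_max), float(top_mass))
```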
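Our first result gives the large deviations of $P_1(\theta)$.

Lemma 4.2. The family of the laws of $P_1(\theta)$ satisfies an LDP on
$[0, 1]$ with speed $\theta$ and rate function $I(\cdot)$ [given by (3)].

Proof. Let $\hat{P}_1(\theta) = \max\{X_1, \ldots, X_{n_2(\theta)}\}$. Then
clearly $P_1(\theta) \ge \hat{P}_1(\theta)$. By Lemma 3.3, for any
$\delta > 0$, one has

$$\limsup_{\theta \to \infty} \frac{1}{\theta} \log P\{P_1(\theta) - \hat{P}_1(\theta) > \delta\} \le \limsup_{\theta \to \infty} \frac{1}{\theta} \log P\{W_{n_2(\theta)} > \delta\} = -\infty.$$

In other words, $P_1(\theta)$ and $\hat{P}_1(\theta)$ are exponentially
equivalent and, thus, have the same LDPs. By definition, we have

$$U_1 = X_1 \le \hat{P}_1(\theta) \le Z_{n_2(\theta)}.$$

Applying Lemmas 3.1, 3.2 and 2.4, we conclude that the law of
$\hat{P}_1(\theta)$ satisfies an LDP with speed $\theta$ and rate function
$I(\cdot)$. □

Let

(14)  $$V_n = \Big\{(p_1, \ldots, p_n) : 0 \le p_n \le \cdots \le p_1,\ \sum_{k=1}^{n} p_k \le 1\Big\}$$

and

(15)  $\Xi_{n,\theta}$ is the law of $(P_1(\theta), \ldots, P_n(\theta))$ on space $V_n$, $n \ge 1$.

Theorem 4.3. For fixed $n \ge 2$, the family
$\{\Xi_{n,\theta} : \theta > 0\}$ satisfies an LDP with speed $\theta$ and
rate function

(16)  $$S_n(p_1, \ldots, p_n) = \begin{cases} \log \dfrac{1}{1 - \sum_{k=1}^{n} p_k}, & (p_1, \ldots, p_n) \in V_n,\ \sum_{k=1}^{n} p_k < 1, \\ \infty, & \text{else.} \end{cases}$$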



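Proof. Since $V_n$ is compact, by Theorem 2.1(ii), the family
$\{\Xi_{n,\theta} : \theta > 0\}$ satisfies a partial LDP. Let $g_1^{\theta}$
denote the density function of $P_1(\theta)$. Then for any $p \in (0, 1)$,

(17)  $$g_1^{\theta}(p)\, p\, (1 - p)^{1 - \theta} = \theta \int_0^{(p/(1-p)) \wedge 1} g_1^{\theta}(x)\, dx,$$

and the joint density function $g_n^{\theta}$ of
$(P_1(\theta), \ldots, P_n(\theta))$ obtained in Watterson [16] is given by

(18)  $$g_n^{\theta}(p_1, \ldots, p_n) = \frac{\theta^{n-1} \big(1 - \sum_{k=1}^{n-1} p_k\big)^{\theta - 2}}{p_1 \cdots p_{n-1}}\, g_1^{\theta}\Big(\frac{p_n}{1 - \sum_{k=1}^{n-1} p_k}\Big)$$

for
$(p_1, \ldots, p_n) \in V_n^{\circ} = \{(p_1, \ldots, p_n) \in V_n : 0 < p_n < \cdots < p_1 < 1,\ \sum_{k=1}^{n} p_k < 1\}$,
and is zero otherwise. In other words, for any fixed
$(p_1, \ldots, p_n) \in V_n^{\circ}$, we have

(19)  $$g_n^{\theta}(p_1, \ldots, p_n) = \frac{\theta^{n} \big(1 - \sum_{k=1}^{n} p_k\big)^{\theta - 1}}{p_1 \cdots p_n} \int_0^{(p_n/(1 - \sum_{k=1}^{n} p_k)) \wedge 1} g_1^{\theta}(u)\, du.$$

Clearly, $V_n$ is the closure of $V_n^{\circ}$. Now for any
$(p_1, \ldots, p_n) \in V_n$, let

$$V((p_1, \ldots, p_n); \delta) = \{(q_1, \ldots, q_n) \in V_n : |q_k - p_k| < \delta,\ k = 1, \ldots, n\},$$

$$U((p_1, \ldots, p_n); \delta) = \{(q_1, \ldots, q_n) \in V_n : |q_k - p_k| \le \delta,\ k = 1, \ldots, n\}.$$

Then the family
$\{V((p_1, \ldots, p_n); \delta) : \delta > 0,\ (p_1, \ldots, p_n) \in V_n\}$
is a base for the topology of $V_n$. Now assume that $p_n > 0$ and $\delta$ is
smaller than $p_n$. By (19), we have that, for any $(q_1, \ldots, q_n)$ in
$V((p_1, \ldots, p_n); \delta)$,

$$g_n^{\theta}(q_1, \ldots, q_n) \le \frac{\theta^{n} \big(1 - \sum_{k=1}^{n} (p_k - \delta)\big)^{\theta - 1}}{(p_1 - \delta) \cdots (p_n - \delta)},$$

which implies

(20)  $$\limsup_{\theta \to \infty} \frac{1}{\theta} \log \Xi_{n,\theta}\{U((p_1, \ldots, p_n); \delta)\} \le -\log \frac{1}{1 - \sum_{k=1}^{n} (p_k - \delta)}.$$

Letting $\delta$ go to zero, we get

(21)  $$\lim_{\delta \to 0} \limsup_{\theta \to \infty} \frac{1}{\theta} \log \Xi_{n,\theta}\{U((p_1, \ldots, p_n); \delta)\} \le -S_n(p_1, \ldots, p_n).$$

Next we turn to the lower bound. First note that, if
$\sum_{k=1}^{n} p_k = 1$, the lower bound is trivially true since
$S_n(p_1, \ldots, p_n) = \infty$. Hence, we assume that
$\sum_{k=1}^{n} p_k < 1$, $p_n > 0$. We also assume that
$0 < \delta < (1 - \sum_{k=1}^{n} p_k)/n$. Using (19) again, one has that, for
any $(q_1, \ldots, q_n)$ in $V((p_1, \ldots, p_n); \delta) \cap V_n^{\circ}$,

$$g_n^{\theta}(q_1, \ldots, q_n) \ge \theta^{n} \Big(1 - \sum_{k=1}^{n} (p_k + \delta)\Big)^{\theta - 1} \int_0^{((p_n - \delta)/(1 - \sum_{k=1}^{n} (p_k - \delta))) \wedge 1} g_1^{\theta}(u)\, du,$$

which implies

$$\liminf_{\theta \to \infty} \frac{1}{\theta} \log \Xi_{n,\theta}\{V((p_1, \ldots, p_n); \delta)\} \ge -\log \frac{1}{1 - \sum_{k=1}^{n} (p_k + \delta)} - \inf\Big\{I(p) : p < \frac{p_n - \delta}{1 - \sum_{k=1}^{n} (p_k - \delta)} \wedge 1\Big\} = -\log \frac{1}{1 - \sum_{k=1}^{n} (p_k + \delta)},$$

where in the second line we used the LDP of the law of $P_1(\theta)$ obtained
in Lemma 4.2. Letting $\delta$ go to zero, we get

(22)  $$\lim_{\delta \to 0} \liminf_{\theta \to \infty} \frac{1}{\theta} \log \Xi_{n,\theta}\{V((p_1, \ldots, p_n); \delta)\} \ge -S_n(p_1, \ldots, p_n).$$

Finally, we turn to the case when there is $1 \le k < n$ such that $p_i > 0$
for $i = 1, \ldots, k$ and $p_i = 0$ for $i \ge k + 1$. Because of the lower
semicontinuity of all rate functions in the partial LDP and the continuity of
$S_n(p_1, \ldots, p_n)$, (22) holds in this case. On the other hand, noting
that $S_n(p_1, \ldots, p_n) = S_k(p_1, \ldots, p_k)$ and
$\Xi_{n,\theta}\{U((p_1, \ldots, p_n); \delta)\} \le \Xi_{k,\theta}\{U((p_1, \ldots, p_k); \delta)\}$,
it follows that the upper bound also holds. By Theorem 2.1(i), (21) and (22)
combined with the partial LDP imply the result. □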

Corollary 4.1. For $k \ge 2$, the family of the laws of $P_k(\theta)$
satisfies an LDP on $[0, 1]$ with speed $\theta$ and rate function

(23)  $$I_k(x) = \begin{cases} \log \dfrac{1}{1 - kx}, & x \in [0, 1/k), \\ \infty, & \text{else.} \end{cases}$$

Proof. For any $k \ge 2$, define

$$\phi_k : V_k \to [0, 1], \qquad (p_1, p_2, \ldots, p_k) \mapsto p_k.$$

Clearly, $\phi_k$ is continuous, and Theorem 4.3 combined with the contraction
principle implies that the law of $P_k(\theta)$ satisfies an LDP on $[0, 1]$
with speed $\theta$ and rate function

$$I'(p) = \inf\{S_k(p_1, \ldots, p_k) : p_1 \ge \cdots \ge p_k = p\}.$$

For $p > 1/k$, the infimum is over the empty set and is thus infinity. For $p$
in $[0, 1/k]$, the infimum is achieved at the point
$p_1 = p_2 = \cdots = p_k = p$. Hence, $I'(p) = I_k(p)$ and the result
follows. □
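
The infimum in this proof can be checked by brute force on a grid. The sketch
below is an illustration under stated assumptions: the names `S`, `I_k` and
`brute_force_inf` are hypothetical, and the candidate values for
$p_1, \ldots, p_{k-1}$ are restricted to a finite grid on $[p, 1]$ that
contains $p$ itself, so the minimizer $p_1 = \cdots = p_k = p$ is a grid point.

```python
import math
from itertools import product


def S(ps):
    """Finite-dimensional rate function: log(1 / (1 - sum)), infinite if sum >= 1."""
    total = sum(ps)
    return math.log(1.0 / (1.0 - total)) if total < 1.0 else math.inf


def I_k(k, x):
    """Closed form log(1 / (1 - k x)) on [0, 1/k), infinite elsewhere."""
    return math.log(1.0 / (1.0 - k * x)) if 0.0 <= x < 1.0 / k else math.inf


def brute_force_inf(k, p, grid=50):
    """Minimum of S over grid points with p_1 >= ... >= p_{k-1} >= p_k = p."""
    levels = [p + (1.0 - p) * i / grid for i in range(grid + 1)]
    best = math.inf
    for head in product(levels, repeat=k - 1):
        # Keep only non-increasing choices of (p_1, ..., p_{k-1}).
        if all(head[i] >= head[i + 1] for i in range(k - 2)):
            best = min(best, S(head + (p,)))
    return best


if __name__ == "__main__":
    for k, p in ((2, 0.3), (3, 0.2), (3, 0.4)):
        print(k, p, brute_force_inf(k, p), I_k(k, p))
```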

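The result of this corollary indicates that, for any $k \ge 1$, the law of
$k P_k(\theta)$ has the same LDP as the law of $P_1(\theta)$. More precisely,
for any $p \in [0, 1]$, one has

(24)  $$\lim_{\delta \to 0} \lim_{\theta \to \infty} \frac{\log P\{|P_1(\theta) - p| < \delta\}}{\log P\{|P_k(\theta) - p/k| < \delta\}} = 1.$$

Hence, when $\theta$ becomes large, $P_k(\theta)$ behaves like
$\frac{1}{k} P_1(\theta)$ at the large deviation scale. In other words, under
a large deviation, the proportion of the most likely allele is $k$ times
the proportion of the $k$th most likely allele.

This relation is also reflected somewhat in Theorem 4.1 and (12). We
illustrate this through the following nonrigorous derivation with
$\beta(\theta)$ defined in (13):

(25)  $$P\{P_1(\theta) \in dx\} = P\{Y_1(\theta) + \beta(\theta) \in \theta\, dx\} \approx P\{Y_1 + \beta(\theta) \in \theta\, dx\} = \theta \exp\Big\{-\theta\Big[x - \frac{\beta(\theta)}{\theta} + \frac{1}{\theta}\, e^{-\theta(x - \beta(\theta)/\theta)}\Big]\Big\}\, dx$$

and

(26)  $$P\{P_k(\theta) \in dy\} = P\{Y_k(\theta) + \beta(\theta) \in \theta\, dy\} \approx P\{k(Y_k + \beta(\theta)) \in \theta\, d(ky)\} = \frac{\theta}{k\,(k-1)!} \exp\Big\{-\theta\Big[x - \frac{k\beta(\theta)}{\theta} + \frac{1}{\theta}\, e^{-(\theta/k)(x - k\beta(\theta)/\theta)}\Big]\Big\}\, dx, \qquad x = ky.$$

Comparing the last terms in (25) and (26), we can see that at the
exponential scale $k P_k(\theta)$ is like $P_1(\theta)$.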

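Now we turn to the LDP of $PD(\theta)$. Let

(27)  $$\overline{V} = \Big\{(p_1, p_2, \ldots) : p_1 \ge p_2 \ge \cdots \ge 0,\ \sum_{k=1}^{\infty} p_k \le 1\Big\}$$

be the closure of $V$ equipped with the subspace topology of
$\mathbb{R}^{\infty}$. Let

(28)  $\Xi_\theta$ be the law of $P(\theta)$ on space $\overline{V}$.

Theorem 4.4. The family $\{\Xi_\theta : \theta > 0\}$ satisfies an LDP with
speed $\theta$ and rate function

(29)  $$S(\mathbf{p}) = \begin{cases} \log \dfrac{1}{1 - \sum_{k=1}^{\infty} p_k}, & (p_1, p_2, \ldots) \in \overline{V},\ \sum_{k=1}^{\infty} p_k < 1, \\ \infty, & \text{else.} \end{cases}$$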



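Proof. Because $\overline{V}$ is compact, by Theorem 2.1, it suffices to
verify (1) for the family $\{\Xi_\theta : \theta > 0\}$. The topology on
$\overline{V}$ can be generated by the following metric:

$$d(\mathbf{p}, \mathbf{q}) = \sum_{k=1}^{\infty} \frac{|p_k - q_k|}{2^k},$$

where $\mathbf{p} = (p_1, p_2, \ldots)$, $\mathbf{q} = (q_1, q_2, \ldots)$.
For any fixed $\delta > 0$, let $B(\mathbf{p}, \delta)$ and
$\overline{B}(\mathbf{p}, \delta)$ denote the respective open and closed balls
centered at $\mathbf{p}$ with radius $\delta > 0$. Set
$n_\delta = 1 + [\log_2(1/\delta)]$, where $[x]$ denotes the integer part of
$x$. Set

$$V_{n_\delta}(\mathbf{p}; \delta/2) = \{(q_1, q_2, \ldots) \in \overline{V} : |q_k - p_k| < \delta/2,\ k = 1, \ldots, n_\delta\}.$$

Then we have

$$V_{n_\delta}(\mathbf{p}; \delta/2) \subset B(\mathbf{p}, \delta).$$

By Theorem 4.3 and the fact that

$$\Xi_\theta\{V_{n_\delta}(\mathbf{p}; \delta/2)\} = \Xi_{n_\delta,\theta}\{V((p_1, \ldots, p_{n_\delta}); \delta/2)\},$$

we get that

(30)  $$\liminf_{\theta \to \infty} \frac{1}{\theta} \log \Xi_\theta\{B(\mathbf{p}, \delta)\} \ge \liminf_{\theta \to \infty} \frac{1}{\theta} \log \Xi_{n_\delta,\theta}\{V((p_1, \ldots, p_{n_\delta}); \delta/2)\} \ge -S_{n_\delta}(p_1, \ldots, p_{n_\delta}) \ge -S(\mathbf{p}).$$

On the other hand, for any fixed $n \ge 1$, $\delta_1 > 0$, let

$$U_n(\mathbf{p}; \delta_1) = \{(q_1, q_2, \ldots) \in \overline{V} : |q_k - p_k| \le \delta_1,\ k = 1, \ldots, n\}.$$

Then we have

$$\Xi_\theta\{U_n(\mathbf{p}; \delta_1)\} = \Xi_{n,\theta}\{U((p_1, \ldots, p_n); \delta_1)\},$$

and, for $\delta$ small enough,

$$\overline{B}(\mathbf{p}, \delta) \subset U_n(\mathbf{p}; \delta_1),$$

which implies that

(31)  $$\lim_{\delta \to 0} \limsup_{\theta \to \infty} \frac{1}{\theta} \log \Xi_\theta\{\overline{B}(\mathbf{p}, \delta)\} \le \limsup_{\theta \to \infty} \frac{1}{\theta} \log \Xi_{n,\theta}\{U((p_1, \ldots, p_n); \delta_1)\} \le -\inf\{S_n(q_1, \ldots, q_n) : (q_1, \ldots, q_n) \in U((p_1, \ldots, p_n); \delta_1)\}.$$

Letting $\delta_1$ go to zero, and then $n$ go to infinity, we get

(32)  $$\lim_{\delta \to 0} \limsup_{\theta \to \infty} \frac{1}{\theta} \log \Xi_\theta\{\overline{B}(\mathbf{p}, \delta)\} \le -S(\mathbf{p}),$$

which combined with (30) implies the result. □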
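Remark. Note that the effective domain is

$$\{\mathbf{p} \in \overline{V} : S(\mathbf{p}) < \infty\} = \Big\{\mathbf{p} \in \overline{V} : \sum_{k=1}^{\infty} p_k < 1\Big\}$$

and

$$\liminf_{\sum_{k} p_k \to 1} S(\mathbf{p}) = \infty.$$

On the other hand, since

$$\Xi_\theta\Big\{\mathbf{p} \in \overline{V} : \Big|\sum_{k=1}^{\infty} p_k - 1\Big| < \delta\Big\} \ge \Xi_\theta\{\mathbf{p} \in V\} = 1,$$

one has

$$\lim_{\delta \to 0} \liminf_{\theta \to \infty} \frac{1}{\theta} \log \Xi_\theta\Big\{\mathbf{p} \in \overline{V} : \Big|\sum_{k=1}^{\infty} p_k - 1\Big| < \delta\Big\} = \lim_{\delta \to 0} \limsup_{\theta \to \infty} \frac{1}{\theta} \log \Xi_\theta\Big\{\mathbf{p} \in \overline{V} : \Big|\sum_{k=1}^{\infty} p_k - 1\Big| \le \delta\Big\} = 0.$$

This might at first sight appear to be a contradiction. However, since the
function $\sum_{k=1}^{\infty} p_k$ is not continuous on $\overline{V}$, the
set $\{\mathbf{p} : |\sum_{k=1}^{\infty} p_k - 1| \le \delta\}$ is not closed
and there is no inconsistency.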

5. Applications. In this section we will discuss two applications of
Theorem 4.4. The first one is the LDP for the homozygosity.

A random sample of size $m > 1$ is selected from a population whose allelic
types have distribution $PD(\theta)$. The probability that all samples are of
the same type is called the $m$th order population homozygosity and is given
by

(33)  $$H_m(\theta) = \sum_{i=1}^{\infty} X_i^m = \sum_{i=1}^{\infty} P_i^m(\theta).$$

Since $H_m(\theta) \le P_1(\theta)$, it follows that $H_m(\theta)$ converges
to zero as $\theta$ approaches infinity. In [8] it is shown that
$\frac{\theta^{m-1}}{\Gamma(m)} H_m(\theta)$ converges to 1 in probability,
that is, $H_m(\theta)$ goes to zero at a magnitude of
$\frac{\Gamma(m)}{\theta^{m-1}}$. Our next theorem describes the large
deviations of $H_m(\theta)$ from zero.

Theorem 5.1. The law of $H_m(\theta)$ for $m > 1$ satisfies an LDP on
$[0, 1]$ with speed $\theta$ and rate function $I(y^{1/m})$, where $I(\cdot)$
is given by (3).



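Proof. For $m > 1$ the map

$$\varphi_m : \overline{V} \to [0, 1], \qquad \mathbf{p} \mapsto \sum_{k=1}^{\infty} p_k^m$$

is continuous. By Theorem 4.4 and the contraction principle, it follows that
the law of $H_m(\theta)$ satisfies an LDP with speed $\theta$ and rate
function

$$\tilde{I}(y) = \inf\{S(\mathbf{p}) : \mathbf{p} \in \overline{V},\ \varphi_m(\mathbf{p}) = y\}.$$

Since for any $\mathbf{p}$ in $\overline{V}$ with
$\varphi_m(\mathbf{p}) = y$, we have

$$\sum_{k=1}^{\infty} p_k \ge (\varphi_m(\mathbf{p}))^{1/m} = y^{1/m},$$

it follows that $S(\mathbf{p}) \ge I(y^{1/m})$ and, thus,
$\tilde{I}(y) \ge I(y^{1/m})$. On the other hand, choosing
$\mathbf{p} = (y^{1/m}, 0, \ldots)$, one gets that
$\tilde{I}(y) \le I(y^{1/m})$. Hence, $\tilde{I}(y) = I(y^{1/m})$, and the
result follows. □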
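
The two inequalities in this proof are easy to probe numerically. The sketch
below is illustrative: `random_point` is a hypothetical helper that produces a
ranked vector with total mass below 1, and finite vectors stand in for
elements of the infinite-dimensional space. It checks that
$S(\mathbf{p}) \ge I(\varphi_m(\mathbf{p})^{1/m})$ on random points and that
equality holds at a point with a single nonzero coordinate.

```python
import math
import random


def I(x):
    """Rate function log(1 / (1 - x)) on [0, 1), infinite elsewhere."""
    return math.log(1.0 / (1.0 - x)) if 0.0 <= x < 1.0 else math.inf


def S(p):
    """Rate function of the ranked frequencies, evaluated on a finite vector."""
    return I(sum(p))


def phi(m, p):
    """m-th order homozygosity functional: sum of p_k ** m."""
    return sum(pk ** m for pk in p)


def homozygosity_rate(m, y):
    """Rate function of the m-th order homozygosity at y."""
    return I(y ** (1.0 / m))


def random_point(rng, k=5):
    """A random ranked vector with total mass below 1 (illustrative helper)."""
    raw = sorted((rng.random() + 1e-9 for _ in range(k)), reverse=True)
    scale = 0.99 * rng.random() / sum(raw)
    return [x * scale for x in raw]


if __name__ == "__main__":
    rng = random.Random(0)
    p = random_point(rng)
    for m in (2, 3):
        print(m, S(p), homozygosity_rate(m, phi(m, p)))
```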

The study of fluctuations of homozygosity goes back to Griffiths [6]. It
was shown in [6] that

(34)  $$\frac{\theta^{3/2}}{\sqrt{2}} \big[H_2(\theta) - E(H_2(\theta))\big] \Rightarrow Z,$$

where $Z$ is the standard normal random variable.

Remark. It is interesting to note that the relation between the large
deviation Theorem 5.1 and the "central limit theorem" (34) is qualitatively
different from the corresponding relation in the classical case of partial
sums of i.i.d. random variables. In the latter case the speed in the large
deviation result is the same as that for the normal approximation and only the
rate functions are different. In contrast, for the case of $H_2(\theta)$, the
speed in the large deviation result is $\theta$ whereas for the normal
approximation, $E(H_2(\theta)) + \frac{\sqrt{2}\, Z}{\theta^{3/2}}$, it would
be $\theta^3$.

Joyce, Krone and Kurtz [8] obtained the following generalization of (34)
to the $m$th order homozygosity:

(35)  $$\sqrt{\theta}\, \Big(\frac{\theta^{m-1}}{\Gamma(m)} H_m(\theta) - 1\Big) \Rightarrow Z(m),$$

where $Z(m)$ is a normal random variable with mean zero and variance
$\frac{\Gamma(2m)}{\Gamma(m)^2} - m^2$. One can rewrite (35) as

(36)  $$\frac{\theta^{m - 1/2}}{\Gamma(m)} \big(H_m(\theta) - E[H_m(\theta)]\big) \Rightarrow Z(m),$$

which includes (34) as a special case. Thus, we have two different laws of
large numbers and two different central limit theorems: the convergence of
$H_m(\theta)$ to zero and the fluctuations around the mean, and the
convergence of $\frac{\theta^{m-1}}{\Gamma(m)} H_m(\theta)$ to one and the
associated fluctuations. One can easily go from one to the other by a simple
algebraic transformation.
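
For $m = 2$ the law of large numbers can be checked by simulation. The sketch
below is a rough Monte Carlo illustration, not part of the argument: it
samples the homozygosity through the GEM representation (truncated at a
leftover mass `tol`, an assumption of the sketch) and compares the sample mean
of $H_2(\theta)$ with $1/(1 + \theta)$. That value for $E[H_2(\theta)]$ is the
standard probability that two sampled genes are of the same type; it is not
stated in the text and is used here as an outside fact. The function names are
hypothetical.

```python
import random


def homozygosity(theta, m, rng, tol=1e-8):
    """H_m from one GEM(theta) draw, truncated once the leftover mass is below tol."""
    remaining, total = 1.0, 0.0
    while remaining > tol:
        u = rng.betavariate(1.0, theta)  # U_n ~ Beta(1, theta)
        total += (remaining * u) ** m    # contribution X_n ** m
        remaining *= 1.0 - u
    return total


def mean_homozygosity(theta, m, n_samples, seed=0):
    """Sample mean of H_m(theta) over n_samples independent draws."""
    rng = random.Random(seed)
    return sum(homozygosity(theta, m, rng) for _ in range(n_samples)) / n_samples


if __name__ == "__main__":
    theta = 20.0
    est = mean_homozygosity(theta, 2, 1000)
    # Since Gamma(2) = 1, theta * H_2 should be close to theta / (1 + theta).
    print(est, 1.0 / (1.0 + theta), theta * est)
```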

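The LDP obtained in Theorem 5.1 is associated with the convergence
of $H_m(\theta)$ to zero. It is thus natural to expect to obtain an LDP result
for the convergence of $\frac{\theta^{m-1}}{\Gamma(m)} H_m(\theta)$ to one
from Theorem 5.1. Unfortunately, we are unable to verify this, but have the
following partial information about the possible candidate for the LDP speed
if there is one.

Assume that an LDP with speed $a(\theta)$ and rate function $I(\cdot)$ holds
for the convergence of $\frac{\theta^{m-1}}{\Gamma(m)} H_m(\theta)$ to one.
Then for any constant $c > 0$,

(37)  $$P\Big\{\frac{\theta^{m-1}}{\Gamma(m)} H_m(\theta) \ge 1 + c\Big\} \ge P\Big\{\frac{\theta^{m-1}}{\Gamma(m)} X_1^m \ge 1 + c\Big\} = P\Big\{X_1 \ge \frac{(\Gamma(m)(1 + c))^{1/m}}{\theta^{(m-1)/m}}\Big\} = \Big(1 - \frac{(\Gamma(m)(1 + c))^{1/m}}{\theta^{(m-1)/m}}\Big)^{\theta},$$

which implies that

(38)  $$\inf_{x \ge 1 + c} I(x) \le \lim_{\theta \to \infty} \frac{\theta}{a(\theta)} \cdot \frac{(\Gamma(m)(1 + c))^{1/m}}{\theta^{(m-1)/m}} = 0 \quad \text{if } \lim_{\theta \to \infty} \frac{a(\theta)}{\theta^{1/m}} = \infty.$$

Since $c$ is arbitrary, $I(\cdot)$ is zero over a sequence that goes to
infinity, which contradicts the fact that $\{x : I(x) \le M\}$ is compact for
every positive $M$. Hence, the LDP speed cannot grow faster than
$\theta^{1/m}$.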



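Our second application involves the Poisson-Dirichlet distribution with
selection.

Let $C_b(\overline{V})$ be the set of all bounded continuous functions on
$\overline{V}$. Assume that $a(\theta)$ satisfies either
$\lim_{\theta \to \infty} \frac{a(\theta)}{\theta} = 0$ or
$\lim_{\theta \to \infty} \frac{a(\theta)}{\theta} = c$ for some $c > 0$.
For every $H$ in $C_b(\overline{V})$, define a new probability on
$\overline{V}$ as

(39)  $$\Xi^{H}_{a,\theta}(d\mathbf{p}) = \frac{e^{a(\theta) H(\mathbf{p})}}{E^{\Xi_\theta}\big[e^{a(\theta) H(\mathbf{p})}\big]}\, \Xi_\theta(d\mathbf{p}).$$

Then we have the following:

Theorem 5.2. The family $\{\Xi^{H}_{a,\theta} : \theta > 0\}$ satisfies an LDP
with speed $\theta$ and rate function

(40)  $$S^{a}(\mathbf{p}) = \begin{cases} S(\mathbf{p}), & \text{if } \lim_{\theta \to \infty} \frac{a(\theta)}{\theta} = 0, \\ S^{c}(\mathbf{p}), & \text{if } \lim_{\theta \to \infty} \frac{a(\theta)}{\theta} = c > 0, \end{cases}$$

where

(41)  $$S^{c}(\mathbf{p}) = \sup_{\mathbf{q}} \{c H(\mathbf{q}) - S(\mathbf{q})\} - (c H(\mathbf{p}) - S(\mathbf{p})).$$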

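Proof. By Theorem 2.2 and Theorem 4.4,

$$\lim_{\theta \to \infty} \frac{1}{\theta} \log E^{\Xi_\theta}\big[e^{a(\theta) H(\mathbf{p})}\big] = \lim_{\theta \to \infty} \frac{1}{\theta} \log E^{\Xi_\theta}\big[e^{\theta (a(\theta)/\theta) H(\mathbf{p})}\big] = \begin{cases} 0, & \text{if } \lim_{\theta \to \infty} \frac{a(\theta)}{\theta} = 0, \\ \sup_{\mathbf{q}} \{c H(\mathbf{q}) - S(\mathbf{q})\}, & \text{if } \lim_{\theta \to \infty} \frac{a(\theta)}{\theta} = c > 0. \end{cases}$$

This, combined with the continuity of $H$, implies that, for any $\mathbf{p}$
in $\overline{V}$,

$$\lim_{\delta \to 0} \liminf_{\theta \to \infty} \frac{1}{\theta} \log \Xi^{H}_{a,\theta}\{d(\mathbf{p}, \mathbf{q}) < \delta\} \ge \lim_{\delta \to 0} \liminf_{\theta \to \infty} \Big\{\frac{a(\theta)}{\theta} (H(\mathbf{p}) - \delta') + \frac{1}{\theta} \log \Xi_\theta\{d(\mathbf{p}, \mathbf{q}) < \delta\}\Big\} - \begin{cases} 0, & \text{if } \lim_{\theta \to \infty} \frac{a(\theta)}{\theta} = 0, \\ \sup_{\mathbf{q}} \{c H(\mathbf{q}) - S(\mathbf{q})\}, & \text{if } \lim_{\theta \to \infty} \frac{a(\theta)}{\theta} = c > 0, \end{cases}$$

$$\ge \begin{cases} -S(\mathbf{p}), & \text{if } \lim_{\theta \to \infty} \frac{a(\theta)}{\theta} = 0, \\ -S^{c}(\mathbf{p}), & \text{if } \lim_{\theta \to \infty} \frac{a(\theta)}{\theta} = c > 0, \end{cases}$$

where $\delta'$ converges to zero as $\delta$ goes to zero. Similarly, we have

$$\lim_{\delta \to 0} \limsup_{\theta \to \infty} \frac{1}{\theta} \log \Xi^{H}_{a,\theta}\{d(\mathbf{p}, \mathbf{q}) \le \delta\} \le \lim_{\delta \to 0} \limsup_{\theta \to \infty} \Big\{\frac{a(\theta)}{\theta} (H(\mathbf{p}) + \delta') + \frac{1}{\theta} \log \Xi_\theta\{d(\mathbf{p}, \mathbf{q}) \le \delta\}\Big\} - \begin{cases} 0, & \text{if } \lim_{\theta \to \infty} \frac{a(\theta)}{\theta} = 0, \\ \sup_{\mathbf{q}} \{c H(\mathbf{q}) - S(\mathbf{q})\}, & \text{if } \lim_{\theta \to \infty} \frac{a(\theta)}{\theta} = c > 0, \end{cases}$$

$$\le \begin{cases} -S(\mathbf{p}), & \text{if } \lim_{\theta \to \infty} \frac{a(\theta)}{\theta} = 0, \\ -S^{c}(\mathbf{p}), & \text{if } \lim_{\theta \to \infty} \frac{a(\theta)}{\theta} = c > 0. \end{cases}$$

Since $\overline{V}$ is compact, the result follows by an application of
Theorem 2.1. □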

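Theorem 5.3. Assume that

(42)  $$\lim_{\theta \to \infty} \frac{a(\theta)}{\theta} = \infty$$

and the maximum of $H$ is achieved at a single point $\mathbf{p}_0$. Then the
family $\{\Xi^{H}_{a,\theta}\}$ satisfies an LDP with speed $\theta$ and rate
function

(43)  $$S^{\infty}(\mathbf{p}) = \begin{cases} 0, & \text{if } \mathbf{p} = \mathbf{p}_0, \\ \infty, & \text{else.} \end{cases}$$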


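Proof. Without loss of generality, we assume that
$\sup_{\mathbf{p} \in \overline{V}} H(\mathbf{p}) = 0$. Otherwise we can
multiply both the numerator and the denominator by
$e^{-a(\theta) H(\mathbf{p}_0)}$ in the definition of $\Xi^{H}_{a,\theta}$.

For any $\mathbf{p} \ne \mathbf{p}_0$, choose $\delta$ small enough such that

$$d_1 = \sup_{d(\mathbf{p}, \mathbf{q}) \le \delta} H(\mathbf{q}) < 2 d_2 = 2 \inf_{d(\mathbf{p}_0, \mathbf{q}) \le \delta} H(\mathbf{q}) \le 0.$$

Then by direct calculation, we get

$$\limsup_{\theta \to \infty} \frac{1}{\theta} \log \Xi^{H}_{a,\theta}\{d(\mathbf{p}, \mathbf{q}) \le \delta\} = \limsup_{\theta \to \infty} \frac{1}{\theta} \log \frac{\int_{\{d(\mathbf{p}, \mathbf{q}) \le \delta\}} e^{a(\theta) H(\mathbf{q})}\, \Xi_\theta(d\mathbf{q})}{E^{\Xi_\theta}\big[e^{a(\theta) H(\mathbf{q})}\big]} \le \limsup_{\theta \to \infty} \frac{1}{\theta} \log \Big[e^{a(\theta)(d_1 - d_2)}\, \frac{\Xi_\theta\{d(\mathbf{p}, \mathbf{q}) \le \delta\}}{\Xi_\theta\{d(\mathbf{p}_0, \mathbf{q}) \le \delta\}}\Big] = -\infty$$

and

$$\liminf_{\theta \to \infty} \frac{1}{\theta} \log \Xi^{H}_{a,\theta}\{d(\mathbf{p}_0, \mathbf{q}) < \delta\} \ge \liminf_{\theta \to \infty} \frac{1}{\theta} \log \Big(1 - \frac{e^{a(\theta)(a(\delta) - b(\delta_1))}}{\Xi_\theta\{d(\mathbf{p}_0, \mathbf{q}) \le \delta_1\}}\Big) = 0 \quad \text{by choosing small enough } \delta_1,$$

where

$$a(\delta) = \sup_{d(\mathbf{p}_0, \mathbf{q}) \ge \delta} H(\mathbf{q}), \qquad b(\delta_1) = \inf_{d(\mathbf{p}_0, \mathbf{q}) \le \delta_1} H(\mathbf{q}).$$

This, combined with the compactness of $\overline{V}$ and Theorem 2.1, implies
the result. □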

In [5], simulations were done for several models to study the role of
population size in population genetical models of molecular evolution. One of
the models is an infinite-alleles model with selective overdominance or
heterozygote advantage. It was observed and conjectured that, if the selection
intensity and the mutation rate get large at the same speed, the behavior
looks like that of a neutral model. A rigorous proof of this conjecture was
included in [9]. Using our notation with
$\varphi_m(\mathbf{p}) = \sum_{k=1}^{\infty} p_k^m$, their result can be
stated as follows.

Theorem 5.4 (Joyce, Krone and Kurtz). Choosing
$a(\theta) = c\, \theta^{3/2 + \gamma}$ and
$H(\mathbf{p}) = -\varphi_2(\mathbf{p})$ in (39), then, under
$\Xi^{H}_{a,\theta}$, as $\theta \to \infty$,

(44)  $$\sqrt{\theta}\, (\theta \varphi_2(\mathbf{p}) - 1) \Rightarrow \begin{cases} Z(2), & \text{if } \gamma < 0, \\ Z(2) - 2c, & \text{if } \gamma = 0, \\ -\infty, & \text{if } \gamma > 0, \end{cases}$$

where $\Rightarrow$ denotes the weak convergence and $Z(2)$ is a normal random
variable with mean zero and variance 2.

Now choosing $H(\mathbf{p}) = -\varphi_2(\mathbf{p})$ in Theorem 5.2 and
Theorem 5.3, the next corollary gives an alternate proof of Gillespie's
conjecture.

Corollary 5.1. The family $\{\Xi^{H}_{a,\theta}\}$ satisfies an LDP with speed
$\theta$ and rate function

(45)  $$S^{a}(\mathbf{p}) = \begin{cases} S(\mathbf{p}), & \text{if } \lim_{\theta \to \infty} \frac{a(\theta)}{\theta} = 0, \\ S^{c}(\mathbf{p}), & \text{if } \lim_{\theta \to \infty} \frac{a(\theta)}{\theta} = c > 0, \\ S^{\infty}(\mathbf{p}), & \text{if } \lim_{\theta \to \infty} \frac{a(\theta)}{\theta} = \infty. \end{cases}$$

From Corollary 5.1, it follows that the selection cannot be detected at the
large deviation level for $a(\theta) = o(\theta)$. In other words, a phase
transition occurs at the critical scale $\theta$, which is different from the
critical scale in (44).

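In a recent paper, Joyce and Gao [7] studied the infinite-alleles model with
homozygote advantage. This corresponds to choosing
$H(\mathbf{p}) = \varphi_2(\mathbf{p})$ in (39). A critical phenomenon is
shown to exist in this case. They even obtained the following corollary to
Theorem 5.2.

Corollary 5.2. Choosing $H(\mathbf{p}) = \varphi_2(\mathbf{p})$ in (39), then
the family $\{\Xi^{H}_{a,\theta}\}$ satisfies an LDP with speed $\theta$ and
rate function

(46)  $$S^{a}(\mathbf{p}) = \begin{cases} S(\mathbf{p}), & \text{if } \lim_{\theta \to \infty} \frac{a(\theta)}{\theta} = 0, \\ -c H(\mathbf{p}) + S(\mathbf{p}), & \text{if } \lim_{\theta \to \infty} \frac{a(\theta)}{\theta} = c < c_0, \\ \log \dfrac{1 - \sqrt{1 - 2/c}}{2} + c \Big(\dfrac{1 + \sqrt{1 - 2/c}}{2}\Big)^{2} - c H(\mathbf{p}) + S(\mathbf{p}), & \text{if } \lim_{\theta \to \infty} \frac{a(\theta)}{\theta} = c \ge c_0, \end{cases}$$

where $c_0 > 2$ solves the equation

$$\log \frac{1 - \sqrt{1 - 2/c_0}}{2} + c_0 \Big(\frac{1 + \sqrt{1 - 2/c_0}}{2}\Big)^{2} = 0.$$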
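
The critical constant $c_0$ has no closed form, but it is easy to approximate.
The sketch below assumes that the defining equation is the value of
$\sup_{s \in [0,1)} \{c s^2 + \log(1 - s)\}$ at its interior critical point
$s = (1 + \sqrt{1 - 2/c})/2$ set to zero, which is what formula (41) gives for
$H = \varphi_2$ when all mass is placed on one coordinate. The function names
and the bracketing interval $[2, 3]$ are choices made for this illustration.

```python
import math


def g(c):
    """Value of c * s**2 + log(1 - s) at s = (1 + sqrt(1 - 2/c)) / 2, for c >= 2."""
    r = math.sqrt(1.0 - 2.0 / c)
    return math.log((1.0 - r) / 2.0) + c * ((1.0 + r) / 2.0) ** 2


def solve_c0(lo=2.0, hi=3.0, tol=1e-12):
    """Bisection for the root of g on [lo, hi]; assumes g(lo) < 0 < g(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    c0 = solve_c0()
    print(c0, g(c0))
```

Under these assumptions the root lies between 2.45 and 2.46, consistent with
the requirement $c_0 > 2$.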